The High School & Beyond Data Set: Academic Self-Concept Measures

William Strein 
University of Maryland at College Park 

Summary. A series of confirmatory factor analyses using both
LISREL VI (maximum likelihood method) and LISCOMP
(weighted least squares method using a covariance matrix based
on polychoric correlations), and including cross-validation on
independent samples, was applied to items from the High
School and Beyond data set to explore the measurement
characteristics of a proposed set of academic self-concept
measures. Results most strongly supported a first-order model
with English Self-Concept (ESC) and Math Self-Concept (MSC) as
uncorrelated factors and General School Self-Concept as a factor
correlating with ESC and MSC. Tests for invariance suggested
that the model holds up across gender, but not across SES.

Academic self-concept (ASC) is a topic that is currently receiving
considerable attention in the research press (e.g., Byrne & Shavelson, 1986;
Licht, Stader, & Swenson, 1989; Mboya, 1989). While there are undoubtedly
a variety of reasons for this resurgence of interest, pioneering
theoretical work by Shavelson and his colleagues (e.g., Shavelson & Bolus,
1982; Shavelson, Hubner, & Stanton, 1976) and voluminous empirical
work by Marsh and others (e.g., Marsh, 1984, 1987, 1988a, 1988b; Marsh
& Parker, 1984; Marsh, Parker, & Barnes, 1985; Marsh, Parker, & Smith,
1983; Marsh & Shavelson, 1985) have contributed significantly to this
thrust. One of the drawbacks in researching this area is that large samples
are usually needed in order to identify the relatively small effect sizes
connected with ASC. This is particularly troublesome because ASC is one of
those "sensitive" variables that are often pragmatically hard to collect.
Accordingly, large archival data sets would seem to be particularly
useful in this area. The High School and Beyond (HS&B) data set is one
such source. Although it does not contain an ASC scale, per se, HS&B does
contain several items that have face validity as measures of this construct.
This paper reports the results of a series of confirmatory factor analyses
testing whether a set of items from the HS&B data set can be validated as
an ASC measure. These analyses are very similar, but not identical, to
those reported by Marsh (1988a, b). (Please see Note 1.)

Measurement Models for Academic Self-Concept. Several different
models of the structure of self-concept exist (Byrne, 1984) with varying
degrees of support. The model that formed the conceptual basis for this
study is the Shavelson (e.g., see Shavelson & Bolus, 1982) hierarchical
model with general self-concept at the apex, subject area-specific
self-concepts on the lowest level, and academic self-concept(s) occupying the
intermediate level. There is some conflicting evidence over whether this
model contains a global ASC that subsumes English self-concept (ESC) and
Math self-concept (MSC) or whether ESC and MSC are independent. Recent
evidence tends to support the latter (Byrne & Shavelson, 1986; Marsh &
Shavelson, 1985). Some models also include a "general school" self-concept
(GSSC) that is subsumed by both ESC and MSC. Various levels of
nonacademic self-concept are also included. This paper explores only the
ASC side of the model.

Methodology 

Pre-planned Analyses. The analytic tool used for this project was
confirmatory factor analysis (CFA) using the LISREL VI program (Joreskog
& Sorbom, 1985). Since some of the analyses were "exploratory" in nature,
a set of analyses that included cross-validation was pre-planned. This
procedure reduces post-hoc capitalization on chance findings. Specifically,
the plan included: (a) CFA of a first-order correlated factors model
including ESC, MSC and GSSC (Model 1); (b) possible adjustments based on
this analysis, resulting in Model(s) 1a, 1b, etc.; (c) cross-validation of
the "best" first-order factors model on a separate sample; (d) construction
of a hierarchical model (Model 2) on a separate sample and testing it
against the first-order model to determine which model is most
supportable; (e) cross-validation of Model 2; and (f) testing of the
resulting model for invariance across gender and a 3-level categorization
of SES. Step (d) is only justified if a correlated first-order factor model
is confirmed.

Because the data are not of the interval variety usually associated
with Pearson correlations and the resulting covariance matrices that were
used in the analyses described above, additional CFA analyses were
performed using the LISCOMP program (Muthen, 1988), which uses polyserial
and polychoric correlations as its basic data.

Procedures. A scan of the HS&B codebook produced 12 items as
candidates for measurement variables corresponding to the GSSC, MSC and
ESC latent variables. Four observed variables were tentatively matched to
each of the three constructs (see Figures 1 and 2). Using SPSSX utility
procedures, five independent samples, four with N = 250 and one with N =
500, evenly balanced by gender, were randomly drawn from the 1980
cohort of the HS&B base-year survey of 30,030 high school sophomores.
Since LISREL requires listwise deletion of missing data, exact Ns for each
analysis varied.
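
The sampling step can be sketched as follows. This is only an illustration
in Python, not the SPSSX procedure actually used; the record layout (pairs
of case id and gender), the function name draw_balanced_samples, the seed,
and the synthetic cohort are all assumptions made for the example.

```python
import random

def draw_balanced_samples(records, sizes, seed=0):
    """Draw disjoint random samples, each half male and half female.

    records: iterable of (case_id, gender) pairs with gender "M" or "F"
             (a hypothetical layout, not the HS&B file format).
    sizes:   even sample sizes, e.g. [250, 250, 250, 250, 500].
    """
    rng = random.Random(seed)
    pools = {"M": [], "F": []}
    for case_id, gender in records:
        pools[gender].append(case_id)
    for pool in pools.values():
        rng.shuffle(pool)  # random order within each gender

    samples = []
    start = 0
    for n in sizes:
        half = n // 2
        if start + half > min(len(p) for p in pools.values()):
            raise ValueError("not enough cases to fill all samples")
        # Consecutive slices of a shuffled pool never overlap,
        # so no case appears in more than one sample.
        samples.append(pools["M"][start:start + half]
                       + pools["F"][start:start + half])
        start += half
    return samples

# Synthetic stand-in for the 30,030-case base-year file (even ids = "M").
cohort = [(i, "M" if i % 2 == 0 else "F") for i in range(30030)]
samples = draw_balanced_samples(cohort, [250, 250, 250, 250, 500])
```

Listwise deletion would then be applied within each sample, which is why
the usable N in any one analysis falls below the drawn N.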

Figure 1:

Observed Variables Corresponding to Latent Self-Concept Variables

English Self-Concept

E1  I am usually at ease in English class. (T/F)
E2  Doing English assignments makes me feel tense. (T/F)
E3  English class does not scare me at all. (T/F)
E4  I dread English class. (T/F)

Mathematics Self-Concept

M1  I am usually at ease in Mathematics class. (T/F)
M2  Doing Mathematics assignments makes me feel tense. (T/F)
M3  Mathematics class does not scare me at all. (T/F)
M4  I dread Mathematics class. (T/F)

General School Self-Concept

S1  Others see you as a good student? (Very, somewhat, not at all)
S2  I am interested in school. (T/F)
S3  I like to work hard in school. (T/F)
S4  Regardless of plans, ability to complete college? (5-point Likert)



Results 

Figure 2: Models of Self-Concept Structure (Model 1)

Model 1: Initial Analysis. Model 1 was a first-order correlated factors
model with ESC, MSC and GSSC as latent variables. Each was uniquely
represented by four observed variables (items) as per Figure 2. All
possible correlations between the factors were allowed, but errors (i.e.,
uniquenesses) for the individual items were not allowed to correlate.
Model testing using the LISREL VI program produced a highly significant
χ² (see Table 1). Although this would seem to suggest a very poorly fitting
model, two factors must be considered: (a) all of the analyses in this
project use relatively large samples (215 is the smallest), which produce
large χ² values and thus tend to flag even trivially small lack of fit as
significant, and (b) the χ² statistic is sensitive to departures from
normality, especially in conjunction with noninterval data, such as the
dichotomously-scored and Likert items in this study (Joreskog & Sorbom,
1988). Accordingly, the Goodness-of-Fit Index (GFI) (Joreskog & Sorbom,
1988), which is less influenced by nonnormality, the Root Mean Square
Residual (RMR), the χ²/df ratio (a flawed but commonly used index), and the
number of normalized residuals > |2.0| will be emphasized in this paper,
except when directly comparing a series of nested models. To compare
nested models, Sobel and Bohrnstedt's (1985) "baseline model" approach
will be used. In this procedure, a baseline model is identified based on the
current state of research knowledge regarding the relationships of interest.
Alternative models are then tested for improvements over the current
state of knowledge. Sobel and Bohrnstedt argue convincingly for the use of
this approach over the "null model" approach advocated by Bentler and
Bonett (1980) whenever prior knowledge provides clear support for some
structuring of the data. In the present case, Model 1 was chosen as the
baseline model based on the substantial amount of previous research
referred to in the previous section.

Table 1

Fit Indices for Validation and Cross-validation of Models (LISREL VI ML
Method)

                                                           No. of Normalized
Model       Sample  χ² (df)      p      χ²/df  GFI   RMR   Resid. > |2.0|

1           1       139.62 (51)  .0001  2.74   .907  .018  6/78
1a          1       112.16 (49)  .0001  2.29   .925  .017  4/78
2           1       112.84 (50)  .0001  2.26   .925  .017  8/78
2           2        89.35 (50)  .001   1.79   .934  .016  5/78
2 (tau-eq)  2       150.86 (56)  .0001  2.69   .904  .037  8/78



Data for the fit indices for Model 1 are displayed in Table 1. Taken
collectively, the fit indices suggested a model that approached an
acceptable fit, but that clearly needed improvement. Analysis of
normalized residuals (6 of 78 were > |2.0|) and modification indices
suggested that the model could be improved by allowing the uniqueness
terms for two of the ESC and MSC item pairs (E3/M3 and E4/M4) to correlate.
Correlated errors should only be allowed when there is good reason to
believe that the items share specific variance (often method variance)
rather than indicating an unidentified factor (Wheaton, 1987). In the
present case, the ESC and MSC items are identical, except for the "English"
or "math" term. It therefore seems quite reasonable that each respective
pair shares considerable specific variance. There is a particularly strong
case for this assertion in regard to the item pairs in question, because they
contain extreme wording ("English [math] doesn't scare me"; "I dread
English [math]"). The analysis also indicated a nonsignificant correlation
between MSC and ESC.
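
The degrees of freedom reported for these models can be reproduced by
counting, as in the short Python sketch below. It assumes that each factor's
variance is fixed at 1 for identification; the paper does not state the
constraint used, but fixing one loading per factor and freeing the three
factor variances gives the same count. The function names are illustrative
only.

```python
def distinct_moments(n_items):
    """Number of distinct variances and covariances among n_items variables."""
    return n_items * (n_items + 1) // 2

def model_df(n_items, loadings, uniquenesses, factor_corrs, corr_uniq=0):
    """Degrees of freedom = distinct moments minus free parameters."""
    free = loadings + uniquenesses + factor_corrs + corr_uniq
    return distinct_moments(n_items) - free

# Model 1: 12 loadings, 12 uniquenesses, 3 factor correlations
# (factor variances assumed fixed at 1).
df_model_1 = model_df(12, loadings=12, uniquenesses=12, factor_corrs=3)

# Model 1a: adds correlated uniquenesses for the E3/M3 and E4/M4 pairs.
df_model_1a = model_df(12, loadings=12, uniquenesses=12, factor_corrs=3,
                       corr_uniq=2)

print(distinct_moments(12), df_model_1, df_model_1a)  # 78 51 49
```

The 78 distinct moments are also the denominator in the normalized residual
counts (e.g., 6 of 78).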

Model 1a: Selected Correlated Uniquenesses. Based on the initial
results with Model 1, the model was retested on the same sample allowing
correlated uniqueness terms for the two ESC and MSC item pairs. (See
Table 1 for fit indices.) Comparison of the χ² values for Models 1 and 1a
showed a significant (p < .01) decrease in the χ² value, thus indicating
significantly better fit. Using Model 1 as the "baseline" model, the
incremental change index was Δ = .197 (Sobel & Bohrnstedt, 1985), indicating
that Model 1a represents only a modest improvement over Model 1, albeit
a statistically significant improvement. This finding is also supported by
consistent, but modest, improvements in the other fit indices. This model
clearly deserves further consideration. Analysis of the remaining four
significant residuals did not suggest any conceptually justifiable
modifications in factor loadings or correlated errors, but once again the
MSC/ESC correlation was nonsignificant. Constraining MSC and ESC to be
independent is justifiable in light of previous research (Marsh, Byrne, &
Shavelson, 1988) supporting the independence of these constructs. A
model (Model 2) with MSC and ESC uncorrelated was thus chosen for
further validation. As shown in Table 1, the fit indices for Model 2 were
virtually identical to those for Model 1a, except for an increase in the
number of normalized residuals > |2.0|. However, inspection of the
normalized residuals in both Models 1a and 2 shows a nearly identical
pattern. In both cases the S3 variable accounts for a majority of the
significant normalized residuals.
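
The comparison of Models 1 and 1a can be retraced with a few lines of
Python. The sketch assumes that the incremental index is the proportional
reduction in χ² relative to the baseline model; that form is an assumption
here, although it does reproduce the reported value of .197. The function
name and the constant holding the tabled critical value are illustrative.

```python
def incremental_fit(chi2_baseline, chi2_model):
    """Proportional reduction in chi-square relative to the baseline model."""
    return (chi2_baseline - chi2_model) / chi2_baseline

chi2_m1, df_m1 = 139.62, 51     # Model 1 (baseline), from Table 1
chi2_m1a, df_m1a = 112.16, 49   # Model 1a, from Table 1

diff = chi2_m1 - chi2_m1a       # chi-square difference between nested models
diff_df = df_m1 - df_m1a        # degrees of freedom for the difference
CRITICAL_01_DF2 = 9.21          # tabled chi-square value, 2 df, alpha = .01

delta = incremental_fit(chi2_m1, chi2_m1a)
print(round(diff, 2), diff_df, diff > CRITICAL_01_DF2, round(delta, 3))
# 27.46 2 True 0.197
```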

Given the nearly identical fit indices for Models 1a and 2, other
research supporting the independence of ESC and MSC (Marsh, Byrne, &
Shavelson, 1988), and the fact that Model 2 is slightly more parsimonious
than Model 1a, Model 2 was chosen for cross-validation. In view of the
apparent orthogonal relationship between two of the three constructs, a
higher order model was not investigated.

Cross-validation of Model 2. To cross-validate Model 2, a second
sample (N = 219) was used in the LISREL analysis. The model held up well,
producing better fit indices (see Table 1) than in the previous sample. This
congeneric model was further tested to see if it was tau-equivalent (i.e.,
equal factor loadings among each respective set of observed variables).
With an increase of 6 degrees of freedom, the χ² jumped by 61.51 points
to 150.86, a clearly significant difference. A tau-equivalent model cannot
be supported. Taken collectively at this step, the analyses supported a
congeneric 3-factor model, with no correlation between ESC and MSC, a
moderate correlation between ESC and GSSC (r = .40) and a low correlation
between MSC and GSSC (r = .25). These results are consistent with Marsh's
(1988a) finding of the independence of the ESC and MSC latent constructs
using the same items as measurement variables.
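
The size of the tau-equivalence difference can be put on a probability
scale. The Python sketch below computes the upper-tail probability of a
χ² variable with integer degrees of freedom using only the standard
library; the helper chi2_sf is written for this illustration and is not
part of LISREL or any other package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # closed form for df = 1
    # Each pass adds one term of the incomplete-gamma recurrence,
    # raising the degrees of freedom by 2 until df is reached.
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

chi2_congeneric, df_congeneric = 89.35, 50   # Model 2, sample 2 (Table 1)
chi2_tau_eq, df_tau_eq = 150.86, 56          # tau-equivalent version

diff = chi2_tau_eq - chi2_congeneric
diff_df = df_tau_eq - df_congeneric
p_value = chi2_sf(diff, diff_df)
print(round(diff, 2), diff_df, p_value < 0.001)
```

For a difference of 61.51 on 6 degrees of freedom the tail probability is
far below .001, in line with the conclusion that the tau-equivalent model
cannot be supported.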

Comparison Across Gender. Since the model was explored and
confirmed on samples that included both males and females, and since
gender differences on such scales are plausible (Licht, Stader, & Swenson,
1989), Model 1a was tested for invariance across groups. Although Model 2
was best-supported at this point, the less-restricted, less-informed model
was used in the event that the factor correlations were different in the
different groups. A separate sample of 215 males and 227 females was
used for this series of analyses, which imposed increasingly more
stringent equality constraints [see Joreskog & Sorbom (1988) and Benson &
Tippets (1988) for documented examples of this analytic strategy]. As a
review of Table 2 will show, there is consistent evidence that the model
holds up across gender; in no case did addition of an equality constraint
significantly increase the χ² value. Other fit indices suggest adequate fit
for both males and females. Consistent with previous results, MSC and ESC
were not significantly correlated for either gender. Additionally, for
females MSC did not appear to correlate significantly with GSSC, but this is
unclear given that the overall analysis supported equal factor correlations
across groups.
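
The sequence of nested comparisons summarized in Table 2 can be written out
as a short Python sketch. The χ² values, degrees of freedom, and critical
values are those reported in Table 2; the function and variable names are
illustrative only.

```python
# (label, chi-square, df) for each increasingly constrained model in Table 2.
steps = [
    ("equal number of factors", 176.09, 98),
    ("+ equal loadings", 185.26, 107),
    ("+ equal uniquenesses", 197.37, 121),
    ("+ equal factor correlations", 206.44, 127),
]

# Tabled .05 critical values for the df differences that occur here.
CRITICAL_05 = {9: 16.9, 14: 23.7, 6: 12.6}

def compare_nested(steps, critical):
    """Return (label, chi2 diff, df diff, rejected?) for each added constraint."""
    results = []
    for (_, chi2_a, df_a), (label, chi2_b, df_b) in zip(steps, steps[1:]):
        d_chi2 = round(chi2_b - chi2_a, 2)
        d_df = df_b - df_a
        results.append((label, d_chi2, d_df, d_chi2 > critical[d_df]))
    return results

results = compare_nested(steps, CRITICAL_05)
for row in results:
    print(row)
```

None of the three differences exceeds its critical value, which is the
sense in which the model is said to hold up across gender.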

Table 2

Simultaneous Confirmatory Factor Analyses across Gender for Model 1a

                                                GFI      RMR
Model                     χ²      df   χ²/df    M/F      M/F

1  Equal number           176.09   98  1.80     .93/.95  .02/.01
   of factors

2  Eq. # of factors       185.26  107  1.73     .92/.95  .02/.01
   Equal loadings

3  Eq. # of factors       197.37  121  1.63     .92/.95  .02/.01
   Equal loadings
   Eq. uniquenesses

4  Eq. # of factors       206.44  127  1.62     .92/.94  .03/.02
   Equal loadings
   Eq. uniquenesses
   Eq. factor correl.

Model Comparisons       χ² diff.   df   Critical χ² (p < .05)

Model 1 vs. Model 2      9.17       9   16.9
Model 2 vs. Model 3     12.11      14   23.7
Model 3 vs. Model 4      9.07       6   12.6

Comparison of Model Across SES. Some literature (Marsh, Parker, &
Smith, 1983) has suggested that ASC behaves differently in different SES
groups. To investigate this issue, Model 1a was simultaneously tested
across three levels of SES using a sample of 117 low, 203 middle, and 105
high SES subjects. The definition of SES levels comes from a trichotomized
variable in the HS&B data set. Inspection of Table 3 reveals that, by
contrast to the analogous gender analyses, imposition of additional
equality constraints significantly increased the χ² value. The hypothesis of
an equal number of factors is tentatively retained, but factor loadings and
correlations among factors may be different. The model tended to fit best
with the low and middle groups, but worse with the high group. ESC and
MSC remained uncorrelated in all three groups. By contrast, GSSC correlated
significantly with ESC in only the low and middle groups, but correlated
significantly with MSC only in the high group. However, these findings
should be interpreted cautiously because they have not been
cross-validated and the subsample sizes of the low and high groups are
moderate, at best.

Table 3

Simultaneous Confirmatory Factor Analyses across SES for Model 1a

                                                GFI          RMR
Model                     χ²      df   χ²/df    Lo/Mid/Hi    Lo/Mid/Hi

1  Equal number           209.14  147  1.42     .92/.94/.90  .02/.01/.02
   of factors

2  Eq. # of factors       249.83  165  1.51     .91/.94/.87  .03/.02/.03
   Equal loadings

3  Eq. # of factors       299.40  193  1.55     .90/.94/.84  .04/.02/.03
   Equal loadings
   Eq. uniquenesses

4  Eq. # of factors       319.05  205  1.56     .89/.93/.84  .06/.02/.04
   Equal loadings
   Eq. uniquenesses
   Eq. factor correl.

Model Comparisons       χ² diff.   df   Significance Level

Model 1 vs. Model 2     40.69      18   p < .01
Model 2 vs. Model 3     49.57      28   p < .01
Model 3 vs. Model 4     19.65      12   p < .05

Nonnormality Issues. Of the 12 observed variables in this study, 10
are dichotomous (true/false or yes/no), 1 is a 3-point Likert scale and 1 is
a 5-point Likert scale. Clearly, the data are not of the interval variety
usually associated with Pearson correlations and the resulting covariance
matrices that were used as input data for all LISREL analyses in this
study. Joreskog and Sorbom (1988) strongly warn against using Pearson r's
with such data. Accordingly, all of the results of this study, especially the
χ² values, must be viewed with caution.
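
The concern about Pearson r's with dichotomous items can be illustrated
with a small Python sketch comparing the phi coefficient (the Pearson r of
two true/false items) with a rough approximation to the tetrachoric
correlation. The cell counts are hypothetical and are not HS&B data, and
the cosine-pi formula is a closed-form approximation used here only for
illustration; it is not the estimator that LISCOMP applies.

```python
import math

def phi_coefficient(a, b, c, d):
    """Pearson r for two dichotomous items from 2x2 cell counts.

    a = both true, b = first true only, c = second true only, d = both false.
    """
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    A rough closed-form approximation, not a maximum likelihood estimate.
    """
    if b == 0 or c == 0:
        return 1.0
    if a == 0 or d == 0:
        return -1.0
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))

# Hypothetical counts for two true/false items (not HS&B data).
a, b, c, d = 120, 30, 40, 60
phi = phi_coefficient(a, b, c, d)
tet = tetrachoric_cos_pi(a, b, c, d)
print(round(phi, 2), round(tet, 2))
```

With these counts phi is noticeably smaller than the approximate
tetrachoric value, which shows how Pearson correlations on dichotomized
responses can understate the association between the underlying variables.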

The recommended alternative procedure in the case of noninterval
observed variables is to use polychoric coefficients and the weighted least
squares (WLS) method of estimation rather than the maximum likelihood
(ML) method incorporated in the standard LISREL procedure (Joreskog &
Sorbom, 1988). Accordingly, Models 1, 1a and 2 were reanalyzed using the
LISCOMP (Muthen, 1988) program based on polychoric correlations
between the measurement variables and the weighted least squares
estimation procedure. In general, the LISCOMP results paralleled those
from the more commonly used LISREL procedure, but some differences
emerged (see Table 4). On data set 1, Models 1a and 2 both represented
significant (p < .01) improvements in fit over Model 1 using the χ²
difference test, and decreases in the root mean square residuals. However,
the incremental fit Δ's (.085 and .067, respectively) were considerably
smaller than for the LISREL procedure. Cross-validation on sample 2
suggested acceptable fit, especially in view of the lowest RMR, but produced
a higher χ² value than that produced by the LISREL analysis on the same
sample. This was unexpected, given the finding (Joreskog & Sorbom, 1988)
that the χ² value is inflated for this kind of data.

Table 4

Fit Indices for Validation and Cross-validation (LISCOMP WLS Method)

Model  Sample  χ² (df)      p      χ²/df  RMR

1      1       126.44 (51)  .0001  2.48   .191
1a     1       115.72 (49)  .0001  2.36   .178
2      1       118.03 (50)  .0001  2.36   .182
2      2       121.30 (50)  .0001  2.43   .168

Discussion 



Comparison to Marsh (1988a, b) Studies. In a study focusing on the
relationships between school average ability and academic outcomes and
aspirations, Marsh (1988b) used the eight ESC and MSC items and three of
the GSSC (S1, S2, S4) items as measures of a composite academic
self-concept (ASC) variable. Subjects for the study were the 14,000+
respondents to the second follow-up of the HS&B sophomore cohort. As
such, Marsh's (1988b) subjects are from a potentially different universe
than that from which the subjects in the present study were drawn, in that
the present study drew samples from the entire base year cohort of
30,030 subjects. Although Marsh (1988b) does not include analysis of the
measurement model used in this study, the relationships between the ASC
composite variable and other variables in the larger structural model were
consistent with those previously reported in the ASC literature, thus
lending support to the use of these nine items as measures of ASC.

The Marsh (1988a) study, which used the same data set as the
(1988b) study but focused on influences on the formation of ESC and MSC,
used the same eight measurement variables as did the present study.
Marsh did not include a GSSC construct in this study. By contrast to the
present study, Marsh allowed the uniquenesses of each respective set of
ESC and MSC items to correlate. Consistent with the results of the present
study, Marsh (1988a) found support for using the eight variables as
measures of ESC and MSC and also found ESC and MSC to be uncorrelated.
Addressing the "naming" problem, Marsh concluded that these variables
can be thought of as ESC and MSC measures, as contrasted to something
like academic anxiety, given the relationships that he found between the
hypothesized scales and other variables in the structural model. Based on
the observed nonsignificant correlation between ESC and MSC, additional
model-testing on the data set [unreported in Marsh (1988a)] and previous
research, Marsh (1988a) concluded that the results "... provide further
support for the inappropriateness of a single global measure of academic
self-concept" (p. 17).

Based on this set of analyses using several independent samples for
cross-validation and invariance tests, it would seem that Model 2 is
supportable, except perhaps for high SES students. This study did not
provide data to address the "naming problem" (Wheaton, 1987); that is, the
confirmatory factor analyses showed that the variables legitimately may
be considered to form three respective latent variables, but these
analyses do not prove that each latent variable is that construct and not
something else. However, the data provided in the Marsh (1988a) study
that showed theoretically predictable relationships between the ESC and
MSC measures and other variables, such as academic achievement, lend
support to the present interpretation of the ESC and MSC variables. Given
the congruence of results between this study and Marsh (1988a),
researchers may use these variables in the HS&B data set with some
confidence. The status of the GSSC variable is less clear. Analysis of the
residuals suggests that item S3 may be a poor candidate as a measure of
this construct. Marsh (1988b) did not include this item in his composite
measure of ASC. An additional study, excluding S3 but including the other
variables, would help to clarify this issue.

The results of this study provide further confirmation of the
independence of ESC and MSC (Marsh, Byrne, & Shavelson, 1988) and of
the invariance of the structure of self-concept across gender (Byrne, 1988),
at least for adolescents. The possibility that the structure may differ across
SES deserves further consideration. Marsh, Parker, and Smith (1983) found
higher correlations between ASC and academic achievement for high SES
students than for lower SES students. Less work has been done on the
possibility of structural differences in self-concept for these varying
groups.

Parallel analysis of the data by both the maximum likelihood (i.e.,
LISREL) and weighted least squares (i.e., LISCOMP) methods is an unusual
feature of this study. Discussion of the methodological issues surrounding
the use of these contrasting methods with noninterval data is beyond the
scope of this paper, but two interesting findings emerge. First, the results
are largely in agreement with one another regardless of the methodology
used. Second, the WLS χ² for cross-validation on sample 2 was higher than
that from the same analysis using the maximum likelihood procedure. A lower
value would be expected for such data (Joreskog & Sorbom, 1988).